change-point detection
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.93)
- Oceania > Australia (0.28)
- North America > United States (0.14)
- Indian Ocean (0.04)
- (4 more...)
- Europe > United Kingdom > England (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Overview (0.67)
- Research Report > New Finding (0.46)
M-Statistic for Kernel Change-Point Detection
Shuang Li, Yao Xie, Hanjun Dai, Le Song
Detecting the emergence of an abrupt change-point is a classic problem in statistics and machine learning. Kernel-based nonparametric statistics have been proposed for this task; they make fewer assumptions on the distributions than traditional parametric approaches. However, none of the existing kernel statistics provides a computationally efficient way to characterize the extremal behavior of the statistic. Such a characterization is crucial for setting the detection threshold, which controls the significance level in the offline case and the average run length in the online case. In this paper we propose two related, computationally efficient M-statistics for kernel-based change-point detection when the amount of background data is large. A novel theoretical result of the paper is the characterization of the tail probability of these statistics using a new technique based on change-of-measure. This characterization provides accurate detection thresholds for both the offline and online cases in a computationally efficient manner, without resorting to more expensive simulations such as bootstrapping. We show that our methods perform well on both synthetic and real-world data.
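The abstract above describes statistics built from kernel comparisons between a large pool of background data and a block of recent data. A minimal sketch of that idea follows, assuming a Gaussian kernel, one-dimensional samples, and the standard unbiased MMD^2 estimate averaged over several background blocks. The function names, the bandwidth, and the block layout are illustrative assumptions, and the sketch omits the paper's normalization and threshold computation.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased MMD^2 estimate between two equal-size blocks x and y."""
    n = len(x)
    assert n == len(y) and n >= 2
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += (gaussian_kernel(x[i], x[j], bandwidth)
                          + gaussian_kernel(y[i], y[j], bandwidth)
                          - gaussian_kernel(x[i], y[j], bandwidth)
                          - gaussian_kernel(x[j], y[i], bandwidth))
    return total / (n * (n - 1))


def block_averaged_mmd(background, test_block, n_blocks, bandwidth=1.0):
    """Average MMD^2 between the test block and n_blocks disjoint
    background blocks of the same size (hypothetical helper)."""
    b = len(test_block)
    assert len(background) >= n_blocks * b
    values = [
        mmd2_unbiased(background[k * b:(k + 1) * b], test_block, bandwidth)
        for k in range(n_blocks)
    ]
    return sum(values) / n_blocks


random.seed(0)
background = [random.gauss(0.0, 1.0) for _ in range(200)]
same_block = [random.gauss(0.0, 1.0) for _ in range(20)]
shifted_block = [random.gauss(3.0, 1.0) for _ in range(20)]

stat_same = block_averaged_mmd(background, same_block, n_blocks=10)
stat_shifted = block_averaged_mmd(background, shifted_block, n_blocks=10)
print(stat_same, stat_shifted)
```

In this toy setup the statistic for the mean-shifted block should come out clearly larger than for the block drawn from the background distribution. Turning the value into a decision still requires a threshold, which is the part the paper addresses analytically.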
Consistent Kernel Change-Point Detection under m-Dependence for Text Segmentation
Jairo Diaz-Rodriguez, Mumin Jia
Kernel change-point detection (KCPD) has become a widely used tool for identifying structural changes in complex data. While existing theory establishes consistency under independence assumptions, real-world sequential data such as text exhibits strong dependencies. We establish new guarantees for KCPD under $m$-dependent data: specifically, we prove consistency in the number of detected change points and weak consistency in their locations under mild additional assumptions. We perform an LLM-based simulation that generates synthetic $m$-dependent text to validate the asymptotics. To complement these results, we present the first comprehensive empirical study of KCPD for text segmentation with modern embeddings. Across diverse text datasets, KCPD with text embeddings outperforms baselines in standard text segmentation metrics. We demonstrate through a case study on Taylor Swift's tweets that KCPD not only provides strong theoretical and simulated reliability but also practical effectiveness for text segmentation tasks.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (13 more...)
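Kernel change-point detection for segmentation, as discussed in the abstract above, is commonly formulated as minimizing a sum of within-segment kernel costs plus a penalty per segment. The sketch below is one such formulation, assuming a Gaussian kernel, a fixed linear penalty, and an exact O(n^2) dynamic program. The toy two-dimensional vectors stand in for text embeddings, and all names and parameter values are illustrative rather than taken from the paper.

```python
import math


def gaussian_kernel(u, v, bandwidth=1.0):
    """Gaussian kernel between two vectors given as lists of floats."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def kernel_change_points(points, penalty, bandwidth=1.0):
    """Penalized kernel segmentation by exact dynamic programming.

    Segment cost on [a, b): sum_i k(x_i, x_i) - (1 / (b - a)) * sum_{i,j} k(x_i, x_j).
    Returns the sorted list of segment start indices (excluding 0).
    """
    n = len(points)
    # prefix[i][j] = sum of k(points[a], points[b]) over a < i, b < j
    prefix = [[0.0] * (n + 1) for _ in range(n + 1)]
    diag = [0.0] * (n + 1)
    for i in range(n):
        diag[i + 1] = diag[i] + gaussian_kernel(points[i], points[i], bandwidth)
        for j in range(n):
            k = gaussian_kernel(points[i], points[j], bandwidth)
            prefix[i + 1][j + 1] = (k + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])

    def cost(a, b):
        block = prefix[b][b] - prefix[a][b] - prefix[b][a] + prefix[a][a]
        return (diag[b] - diag[a]) - block / (b - a)

    # best[t] = minimal penalized cost of segmenting points[:t]
    best = [0.0] + [math.inf] * n
    prev = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            candidate = best[s] + cost(s, t) + penalty
            if candidate < best[t]:
                best[t] = candidate
                prev[t] = s

    # Walk back through the stored segment starts.
    change_points = []
    t = n
    while t > 0:
        s = prev[t]
        if s > 0:
            change_points.append(s)
        t = s
    return sorted(change_points)


# Toy "embeddings": ten vectors near (0, 0), then ten near (5, 5).
toy = ([[0.01 * (i % 3), 0.0] for i in range(10)]
       + [[5.0 + 0.01 * (i % 3), 5.0] for i in range(10)])
segments = kernel_change_points(toy, penalty=1.0)
print(segments)
```

The penalty trades off the number of segments against fit: a larger value yields fewer change points. How to choose it, and what guarantees hold under dependence, is what the paper's theory is about.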
Riemannian Change Point Detection on Manifolds with Robust Centroid Estimation
Xiuheng Wang, Ricardo Borsoi, Arnaud Breloy, Cédric Richard
Non-parametric change-point detection in streaming time series data is a long-standing challenge in signal processing. Recent advances in statistics and machine learning have increasingly addressed this problem for data residing on Riemannian manifolds. One prominent strategy involves monitoring abrupt changes in the center of mass of the time series. When implemented in a streaming fashion, however, this strategy requires careful step-size tuning when computing the updates of the center of mass. In this paper, we propose to leverage robust centroids on manifolds, drawn from M-estimation theory, to address this issue. Our proposal consists of comparing two centroid estimates: the classical Karcher mean (sensitive to change) versus one defined from Huber's function (robust to change). This comparison leads to the definition of a test statistic whose performance is less sensitive to the underlying estimation method. We propose a stochastic Riemannian optimization algorithm to estimate both robust centroids efficiently. Experiments conducted on both simulated and real-world data across two representative manifolds demonstrate the superior performance of our proposed method.
- Europe > France (0.14)
- North America > United States > New Jersey (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
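The comparison described in the abstract above, a Karcher mean against a Huber-type centroid, can be illustrated on the unit sphere. The sketch below is a batch version under stated assumptions: plain fixed-point iterations instead of the paper's stochastic streaming algorithm, the sphere S^2 as the manifold, and an arbitrary Huber threshold `delta`. The function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def log_map(p, q):
    """Tangent vector at p pointing toward q, with length equal to the geodesic distance."""
    c = max(-1.0, min(1.0, dot(p, q)))
    v = [qi - c * pi for pi, qi in zip(p, q)]
    nv = norm(v)
    if nv < 1e-12:
        return [0.0, 0.0, 0.0]
    theta = math.acos(c)
    return [theta * vi / nv for vi in v]


def exp_map(p, v):
    """Point reached from p by following the geodesic with initial velocity v."""
    n = norm(v)
    if n < 1e-12:
        return list(p)
    q = [math.cos(n) * pi + math.sin(n) * vi / n for pi, vi in zip(p, v)]
    nq = norm(q)
    return [qi / nq for qi in q]  # renormalize against round-off drift


def centroid(points, delta=None, iters=100):
    """Karcher mean if delta is None, otherwise a Huber centroid with threshold delta."""
    mu = list(points[0])
    for _ in range(iters):
        step = [0.0, 0.0, 0.0]
        for x in points:
            v = log_map(mu, x)
            d = norm(v)
            # Huber weighting: full weight for nearby points, delta / d beyond the threshold
            w = 1.0 if delta is None or d <= delta else delta / d
            step = [s + w * vi for s, vi in zip(step, v)]
        mu = exp_map(mu, [s / len(points) for s in step])
    return mu


def geodesic_distance(p, q):
    return math.acos(max(-1.0, min(1.0, dot(p, q))))


pole = [0.0, 0.0, 1.0]
# Twenty pre-change points on a small ring around the pole, five post-change points far away.
ring = [exp_map(pole, [0.1 * math.cos(2 * math.pi * k / 20),
                       0.1 * math.sin(2 * math.pi * k / 20), 0.0])
        for k in range(20)]
window = ring + [[1.0, 0.0, 0.0]] * 5

karcher = centroid(window)
huber = centroid(window, delta=0.5)
statistic = geodesic_distance(karcher, huber)
print(statistic)
```

Because the five distant points receive reduced weight in the Huber centroid, it stays closer to the original cluster than the Karcher mean does, and the geodesic gap between the two estimates serves as a simple indicator of change in this toy setting.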
Diffusion-Based Hypothesis Testing and Change-Point Detection
Sean Moushegian, Taposh Banerjee, Vahid Tarokh
Score-based methods have recently seen increasing popularity in modeling and generation. Methods have been constructed to perform hypothesis testing and change-point detection with score functions, but these methods are in general not as powerful as their likelihood-based peers. Recent works consider generalizing the score-based Fisher divergence into a diffusion-divergence by transforming score functions via multiplication with a matrix-valued function or a weight matrix. In this paper, we extend the score-based hypothesis test and change-point detection stopping rule into their diffusion-based analogs. Additionally, we theoretically quantify the performance of these diffusion-based algorithms and study scenarios where optimal performance is achievable. We propose a method of numerically optimizing the weight matrix and present numerical simulations to illustrate the advantages of diffusion-based algorithms.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.62)
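For context on the score-based stopping rule that the abstract above sets out to generalize, the sketch below shows a CUSUM-type recursion driven by Hyvärinen scores rather than log-likelihoods. It assumes known univariate Gaussian pre- and post-change models, a scaling constant `lam`, and a hand-picked threshold. It is the plain score-based rule, not the diffusion-based variant with a weight matrix that the paper proposes, and the names are illustrative.

```python
def hyvarinen_score_gaussian(x, mean, var):
    """Hyvarinen score of N(mean, var) at x:
    0.5 * (d/dx log p)^2 + d^2/dx^2 log p."""
    grad = -(x - mean) / var
    laplacian = -1.0 / var
    return 0.5 * grad * grad + laplacian


def score_cusum(samples, pre, post, threshold, lam=1.0):
    """Return the first index at which the score-based CUSUM statistic
    reaches the threshold, or None if it never does.

    pre and post are (mean, variance) pairs for the two Gaussian models.
    """
    z = 0.0
    for i, x in enumerate(samples):
        increment = lam * (hyvarinen_score_gaussian(x, *pre)
                           - hyvarinen_score_gaussian(x, *post))
        z = max(0.0, z + increment)  # negative drift before the change, positive after
        if z >= threshold:
            return i
    return None


# Deterministic toy stream: values around 0 for 50 steps, then around 2.
stream = ([-0.5 if i % 2 == 0 else 0.5 for i in range(50)]
          + [2.0 + (-0.5 if i % 2 == 0 else 0.5) for i in range(50)])
alarm = score_cusum(stream, pre=(0.0, 1.0), post=(2.0, 1.0), threshold=5.0)
print(alarm)
```

The increment has negative mean under the pre-change model and positive mean under the post-change model, which is what lets the recursion stay near zero before the change and grow afterward. Only derivatives of the log-density are needed, so the models can be unnormalized.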